What makes Happiness?


For a long time, people have been interested in what makes us happy and how we can improve well-being in society. One big question is whether having more money or wealth actually makes people happier, or if there are other things that matter more. We often hear the saying “money can’t buy happiness,” but the real answer is a bit more complicated than that.

In this data story, we explore how happiness relates to economic and social factors, using data from the World Happiness Report 2019 and the World Development Indicators. The World Happiness Report provides a happiness score for each country, and we compare those scores with indicators such as GDP per capita, gross national income (GNI), the Gini index, unemployment rates, education levels, and life expectancy.

We want to find out if richer countries really have happier people, and if so, how strong this connection is. But we will also look beyond money to see how things like having a job, going to school, and living a long, healthy life affect happiness. For example, being unemployed might make people less happy even if their country is wealthy, and having a good education could improve well-being in ways that money alone can’t.
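One common way to put a number on "how strong this connection is" is a correlation coefficient. The sketch below is only an illustration: the helper name `correlation_strength` and the toy values are assumptions made up for this example, not figures from either dataset. Pearson measures linear association, while Spearman (computed here as Pearson on ranks) only asks whether the two columns rise together.

```python
import pandas as pd


def correlation_strength(df: pd.DataFrame, x: str, y: str) -> dict:
    """Return Pearson and Spearman correlations between two numeric columns."""
    clean = df[[x, y]].dropna()
    return {
        # Pearson: strength of the linear relationship.
        "pearson": clean[x].corr(clean[y]),
        # Spearman: Pearson correlation of the ranks (monotonic relationship).
        "spearman": clean[x].rank().corr(clean[y].rank()),
        "n": len(clean),
    }


# Made-up illustrative values, NOT taken from the report or the WDI file.
toy = pd.DataFrame({
    "GDP per capita": [1_000, 5_000, 20_000, 40_000, 60_000],
    "Happiness Score": [4.0, 5.0, 6.0, 6.5, 7.0],
})

result = correlation_strength(toy, "GDP per capita", "Happiness Score")
print(result)
```

A Spearman value well above the Pearson value would hint that happiness rises with income but not in a straight line, for example if gains flatten out at higher incomes.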

First, we will compare happiness scores with income indicators like GDP per capita and GNI. Then, we will analyze how unemployment rates relate to happiness. After that, we will look at the Gini index, education, and life expectancy to see how these factors could affect happiness scores across countries.
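Each of these comparisons needs the same preparation step: pull one indicator out of the World Development Indicators table and convert it to numbers. A minimal sketch of a reusable helper is shown here. The function name `extract_indicator` is hypothetical, the column names are assumed to match the WDI export used in this story, and the toy rows are invented for illustration.

```python
import pandas as pd


def extract_indicator(wdi: pd.DataFrame, series_name: str, label: str,
                      year_col: str = "2019 [YR2019]") -> pd.DataFrame:
    """Return a two-column frame (Country, <label>) for one WDI series."""
    out = (
        wdi.loc[wdi["Series Name"] == series_name, ["Country Name", year_col]]
        .rename(columns={"Country Name": "Country", year_col: label})
        .copy()
    )
    # Non-numeric placeholders for missing values become NaN and are dropped.
    out[label] = pd.to_numeric(out[label], errors="coerce")
    return out.dropna(subset=[label]).reset_index(drop=True)


# Tiny invented table in the same long layout (one row per country and series).
toy_wdi = pd.DataFrame({
    "Country Name": ["A", "B", "A", "B"],
    "Series Name": [
        "Gini index", "Gini index",
        "Life expectancy at birth, total (years)",
        "Life expectancy at birth, total (years)",
    ],
    "2019 [YR2019]": ["30.1", "..", "80.2", "75.5"],
})

gini = extract_indicator(toy_wdi, "Gini index", "Gini Index")
print(gini)
```

With a helper like this, each analysis step would differ only in the series name and the label passed in.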

By comparing these different aspects, we hope to better understand what really contributes to happiness around the world. This will help us see whether the claim that money buys happiness really holds. If it does not, we can look for what does make people happy.

../_images/e951cea0c9a055dd554c612431f0ddfb45c409517c9accbe9849f6711307fb7b.png
../_images/009a1996efe9179d02f21e98133de0357ed23984458cb13bbf645f34bfbcf2cc.png
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


# Load the World Happiness Report 2019 scores and the World Development Indicators export.
happiness_df = pd.read_csv("2019happy.csv")
wdi_df = pd.read_csv("WDI_new.csv")


# Keep only the country name and the happiness score, renamed for merging.
happiness_df = happiness_df[['Country or region', 'Score']].rename(
    columns={'Country or region': 'Country', 'Score': 'Happiness Score'}
)


# Extract 2019 life expectancy; .copy() avoids a SettingWithCopyWarning on the assignment below.
life_exp_df = wdi_df.loc[
    wdi_df['Series Name'] == 'Life expectancy at birth, total (years)',
    ['Country Name', '2019 [YR2019]']
].copy()
life_exp_df = life_exp_df.rename(columns={'Country Name': 'Country', '2019 [YR2019]': 'Life Expectancy'})
life_exp_df['Life Expectancy'] = pd.to_numeric(life_exp_df['Life Expectancy'], errors='coerce')


# Inner join on country name, then drop rows with missing values.
merged_df = pd.merge(happiness_df, life_exp_df, on='Country').dropna()


merged_df = merged_df.sort_values('Life Expectancy')


plt.figure(figsize=(14, 8))
sns.set(style="whitegrid")

plt.plot(
    merged_df['Life Expectancy'],
    merged_df['Happiness Score'],
    marker='o',
    color='coral',
    linewidth=2,
    markersize=6
)

plt.xticks(fontsize=12)
plt.yticks(range(2, 11), fontsize=12)
plt.ylim(2, 10)

plt.xlabel('Life Expectancy at Birth (Years)', fontsize=13)
plt.ylabel('Life Satisfaction (0–10)', fontsize=13)
plt.title('Self-reported Life Satisfaction vs. Life Expectancy (Line Plot)', fontsize=16, weight='bold')

plt.grid(True, linestyle='--', alpha=0.5)
sns.despine()
plt.tight_layout()
plt.show()
../_images/cbd6b681308e1b70883f683aa14896b7ca90895ab3540a05c1332ac08b2e893a.png
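An inner merge on country names silently drops any country whose name is spelled differently in the two files (for instance, one source might write "Russia" where the other writes "Russian Federation"). The sketch below shows one way to list such mismatches before plotting. The helper name `unmatched_countries` and the toy frames are assumptions for illustration only; `indicator=True` is a standard `pd.merge` option that adds a `_merge` column.

```python
import pandas as pd


def unmatched_countries(left: pd.DataFrame, right: pd.DataFrame,
                        key: str = "Country") -> dict:
    """List key values that appear in only one of the two frames."""
    merged = pd.merge(left[[key]], right[[key]], on=key,
                      how="outer", indicator=True)
    return {
        "left_only": sorted(merged.loc[merged["_merge"] == "left_only", key]),
        "right_only": sorted(merged.loc[merged["_merge"] == "right_only", key]),
    }


# Invented two-row frames standing in for the happiness and WDI tables.
toy_happy = pd.DataFrame({"Country": ["Finland", "Russia"]})
toy_wdi = pd.DataFrame({"Country": ["Finland", "Russian Federation"]})

report = unmatched_countries(toy_happy, toy_wdi)
print(report)
```

Names that show up on one side only could then be harmonised with a small mapping before merging, so that those countries are not lost from the charts.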
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

happiness_df = pd.read_csv("2019happy.csv")
wdi_df = pd.read_csv("WDI_new.csv")

# Keep only the country name and the happiness score, renamed for merging.
happiness_df = happiness_df[['Country or region', 'Score']].rename(
    columns={'Country or region': 'Country', 'Score': 'Happiness Score'}
)

# Extract the 2019 Gini index; .copy() avoids a SettingWithCopyWarning on the assignment below.
gini_df = wdi_df.loc[wdi_df['Series Name'] == 'Gini index', ['Country Name', '2019 [YR2019]']].copy()
gini_df = gini_df.rename(columns={'Country Name': 'Country', '2019 [YR2019]': 'Gini Index'})
gini_df['Gini Index'] = pd.to_numeric(gini_df['Gini Index'], errors='coerce')

# Inner join on country name, then drop countries without a 2019 Gini value.
merged_df = pd.merge(happiness_df, gini_df, on='Country').dropna()

plt.figure(figsize=(12, 7))
sns.kdeplot(
    x=merged_df['Gini Index'],
    y=merged_df['Happiness Score'],
    cmap="viridis",
    fill=True,
    bw_adjust=0.5,
    thresh=0.05
)

plt.xlabel('Gini Index (Income Inequality)', fontsize=13)
plt.ylabel('Happiness Score (0–10)', fontsize=13)
plt.title('Density Heatmap of Happiness vs. Income Inequality', fontsize=16, weight='bold')
plt.grid(True, linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()
../_images/364ecca48f4087bd890dcbc80ad07125fb3abfe555b211dbd9a5ce8ff051f76c.png
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

happiness_df = pd.read_csv("2019happy.csv")
wdi_df = pd.read_csv("WDI_new.csv")

# Keep only the country name and the happiness score, renamed for merging.
happiness_df = happiness_df[['Country or region', 'Score']].rename(
    columns={'Country or region': 'Country', 'Score': 'Happiness Score'}
)

# Extract 2019 education spending; .copy() avoids a SettingWithCopyWarning on the assignment below.
edu_df = wdi_df.loc[
    wdi_df['Series Name'] == 'Government expenditure on education, total (% of government expenditure)',
    ['Country Name', '2019 [YR2019]']
].copy()
edu_df = edu_df.rename(columns={'Country Name': 'Country', '2019 [YR2019]': 'Education Spending (%)'})
edu_df['Education Spending (%)'] = pd.to_numeric(edu_df['Education Spending (%)'], errors='coerce')

# Inner join on country name, then drop countries without a 2019 value.
merged_df = pd.merge(happiness_df, edu_df, on='Country').dropna()

plt.figure(figsize=(12, 7))
sns.set(style="whitegrid")

sns.scatterplot(
    data=merged_df,
    x='Education Spending (%)',
    y='Happiness Score',
    color='mediumseagreen',
    s=100,
    edgecolor='white'
)

plt.xlabel('Government Education Spending (% of Gov. Budget)', fontsize=13)
plt.ylabel('Happiness Score (0–10)', fontsize=13)
plt.title('Happiness Score vs. Education Spending (Scatter Plot)', fontsize=16, weight='bold')
plt.grid(True, linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()
../_images/d60ff03322d4a74d00e8c5e19bc584c0f1304488afbb478737f732a6121493f0.png
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

def format_gni_label(x):
    if x >= 1_000_000:
        return f"${x/1_000_000:.1f}M"
    elif x >= 1_000:
        return f"${int(x/1_000)}K"
    else:
        return f"${int(x)}"

happy_df = pd.read_csv('2019happy.csv')
# Rename the columns used for merging and plotting.
happy_df = happy_df.rename(columns={
    'Country or region': 'Country',
    'Score': 'Happiness Score'
})

wdi_df = pd.read_csv('WDI_new.csv')

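# Note: this series is in constant local currency units (LCU), so values are
# not directly comparable across countries with different currencies.
gni_df = wdi_df[
    (wdi_df['Series Name'] == 'GNI per capita (constant LCU)') &
    (wdi_df['2019 [YR2019]'].notna())
][['Country Name', '2019 [YR2019]']]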

gni_df = gni_df.rename(columns={
    'Country Name': 'Country',
    '2019 [YR2019]': 'GNI per capita'
})

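# Inner join keeps only countries present in both files.
merged_df = pd.merge(happy_df, gni_df, on='Country', how='inner')

merged_df['GNI per capita'] = pd.to_numeric(merged_df['GNI per capita'], errors='coerce')
# .copy() gives an independent frame, so the column assignments below do not raise SettingWithCopyWarning.
filtered_df = merged_df.dropna(subset=['GNI per capita', 'Happiness Score']).copy()

# Split countries into up to 16 equal-sized income groups (quantile bins).
filtered_df['GNI_bin'] = pd.qcut(filtered_df['GNI per capita'], q=16, duplicates='drop')

bin_edges = filtered_df['GNI_bin'].cat.categories
clean_labels = [
    f"{format_gni_label(interval.left)} - {format_gni_label(interval.right)}"
    for interval in bin_edges
]
filtered_df['GNI_bin_str'] = filtered_df['GNI_bin'].cat.rename_categories(clean_labels)

# observed=True aggregates only bins that contain data and silences the pandas FutureWarning.
avg_happiness_per_bin = filtered_df.groupby('GNI_bin_str', observed=True)['Happiness Score'].mean().reset_index()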

fig = px.bar(
    avg_happiness_per_bin,
    x='GNI_bin_str',
    y='Happiness Score',
    title='Average Happiness Score per GNI per Capita Bin',
    labels={'GNI_bin_str': 'GNI per Capita Range', 'Happiness Score': 'Average Happiness Score'},
    text=avg_happiness_per_bin['Happiness Score'].round(2),
    height=600
)

fig.add_trace(
    go.Scatter(
        x=avg_happiness_per_bin['GNI_bin_str'],
        y=avg_happiness_per_bin['Happiness Score'],
        mode='lines+markers',
        line=dict(color='darkgrey', width=3, shape='spline'),
        marker=dict(size=8),
        name='Trend Line'
    )
)

fig.update_layout(xaxis_tickangle=45)
fig.show()
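The bar chart relies on quantile binning: `pd.qcut` splits countries into groups of roughly equal size by income, and the mean happiness score is then computed within each group. The small sketch below illustrates the mechanics on invented numbers (the column names `income` and `score` and all values are made up for this example).

```python
import pandas as pd

# Invented values for illustration only.
toy = pd.DataFrame({
    "income": [1, 2, 3, 4, 5, 6, 7, 8],
    "score": [3.0, 4.0, 4.0, 5.0, 5.0, 6.0, 6.0, 7.0],
})

# q=4 asks for quartiles: each bin receives the same number of rows.
toy["income_bin"] = pd.qcut(toy["income"], q=4)

# Count rows per bin and average the score within each bin.
counts = toy["income_bin"].value_counts()
bin_means = toy.groupby("income_bin", observed=True)["score"].mean()

print(counts)
print(bin_means)
```

Because the bins hold equal numbers of countries rather than equal income ranges, the bin widths can differ a lot, which is worth keeping in mind when reading the x-axis labels.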